Photo-real lips synthesis with trajectory-guided sample selection
Authors
Abstract
In this paper, we propose an HMM trajectory-guided, real-image sample concatenation approach to photo-real talking head synthesis. It renders a smooth and natural video of the articulators in sync with given speech signals. An audio-visual database is first used to train a statistical Hidden Markov Model (HMM) of lip movement; the trained model is then used to generate a visual parameter trajectory of lip movement for given speech signals, all in the maximum likelihood sense. The HMM-generated trajectory then serves as a guide for selecting, from the original training database, an optimal sequence of mouth images, which are stitched back onto a background head video. The whole procedure is fully automatic and data-driven. With as little as 20 minutes of audio/video footage from a speaker, the proposed system can synthesize a highly photo-real video in sync with the given speech signals. This system won first place in the Audio-Visual match contest of the LIPS2009 Challenge, which was perceptually evaluated by recruited human subjects. http://www.lips2008.org/
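The selection step described in the abstract can be illustrated with a small dynamic-programming sketch. The abstract does not state the cost function or the search procedure, so everything below is an assumption made for illustration: the function names (`select_samples`, `sq_dist`), the squared-distance target cost between a guide frame and a candidate sample, the squared-distance concatenation cost between consecutive samples, and the `concat_weight` parameter are all hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_samples(guide: List[Vector], samples: List[Vector],
                   concat_weight: float = 1.0) -> List[int]:
    """Pick one sample index per guide frame by dynamic programming.

    Assumed total cost (not taken from the paper):
      sum of target costs (guide frame vs. chosen sample)
      + concat_weight * sum of costs between consecutive chosen samples.
    """
    n = len(samples)
    if not guide or n == 0:
        return []

    # best[j]: lowest cost of any path that ends in sample j at the current frame
    best = [sq_dist(guide[0], s) for s in samples]
    back: List[List[int]] = []  # back[t][j]: best predecessor of j at frame t + 1

    for frame in guide[1:]:
        new_best: List[float] = []
        pointers: List[int] = []
        for j in range(n):
            # Cost of reaching sample j from each possible predecessor i
            step = [best[i] + concat_weight * sq_dist(samples[i], samples[j])
                    for i in range(n)]
            i_min = min(range(n), key=step.__getitem__)
            new_best.append(step[i_min] + sq_dist(frame, samples[j]))
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path backwards from the final frame
    j = min(range(n), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With `concat_weight` set to 0 the search reduces to picking the nearest sample for each frame independently; larger values favor smoother sequences at the expense of fidelity to the guide trajectory. The search is quadratic in the number of candidate samples per frame, so a practical system would presumably prune candidates first.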
Similar resources
Synthesizing photo-real talking head via trajectory-guided sample selection
In this paper, we propose an HMM trajectory-guided, real image sample concatenation approach to photo-real talking head synthesis. It renders a smooth and natural video of articulators in sync with given speech signals. An audio-visual database is used to train a statistical Hidden Markov Model (HMM) of lips movement first and the trained model is then used to generate a visual parameter trajec...
Sliding Window-based Speech-to-Lips Conversion with Low Delay
The goal of a good speech-to-lips conversion system is to synthesize high-quality, realistic lip movement that is time-synchronized with the input speech. Previously, the maximum probability estimation of visual trajectory by Gaussian Mixture Model (GMM) has been successfully proposed and tested for speech-to-lips conversion. It works as a sentence-level batch process that converts acoustic sp...
A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion
High-quality speech-to-lips conversion, investigated in this work, renders realistic lip movement (video) consistent with input speech (audio) without knowing its linguistic content. Instead of memoryless frame-based conversion, we adopt maximum likelihood estimation of the visual parameter trajectories using an audio-visual joint Gaussian Mixture Model (GMM). We propose a minimum converted tra...
Green synthesis of nano size CoFe2O4 using Chenopodium album leaf extract for photo degradation of organic pollutants
Green synthesis of nanoparticles makes use of environmentally friendly, non-toxic and safe reagents. In this study, we synthesised CoFe2O4 in a green approach, using leaf extract of Chenopodium album. The structure of the synthesised sample was analyzed by X-ray diffraction methods. The synthesised photocatalyst was applied for photo degradation of methyl orange as a reliable model pollutant. The...
Real-Time Exemplar-Based Face Sketch Synthesis
This paper proposes a simple yet effective face sketch synthesis method. Similar to existing exemplar-based methods, a training dataset containing photo-sketch pairs is required, and a K-NN photo patch search is performed between a test photo and every training exemplar for sketch patch selection. Instead of using the Markov Random Field to optimize global sketch patch selection, this paper for...
Publication date: 2010